Analysis B

Analysis Techniques for Computational Material Science

November 6, 2024

Computational Material Science

  • modelling system under periodic boundary conditions

  • describing structural properties

  • comprehension thermal and mechanical properties

  • virtual screening which encompasses the development of advanced functional materials:

    • anodes, cathods,
    • solar cells,
    • catalysts,
    • bioactive glasses,
    • sensors,
    • alloys materials etc.

Analysis of Data

It is important to be aware of the accuracy and limitations of the computational methods employed in order to ensure the reliability of the results obtained.


Question

What do you think are the limitations of a NNP?

Answer

  • Modelling something out of the training set can be challenging.
  • Need of GPUS

Analysis of Data

It is important to be aware of the accuracy and limitations of the computational methods employed in order to ensure the reliability of the results obtained.


Question

What is your assessment of the accuracy of NNP methods?

Answer

The degree of accuracy is depending on the quality and size of the training set.

Preanalysis

  • Simulation finished without errors: no abort, no error messages appear


  • Visual inspectation of the system: system behave properly


  • Equilibrium state of simulation

Visual inspection

Check the system is working like expected.

Open the *.xyz file with VMD.


It is a good sign if

Troubleshooting

In the event that the simulation does not produce the anticipated results:

Equilibrium state of simulation

Firstly, in order to obtain meaningful physical properties from a simulation, it is essential to ensure that the system is in an equilibrium state.

Question

However, what constitutes an equilibrium state?

Answer

Equilibrium state can be defined as a state in which macroscopic properties \(A\) (e.g temperture \(T\), pressure \(P\), energies \(E\), volume \(V\),…) do NOT undergo major changes over time.

Once these properties fluctuate around a constant pattern without any long-term trends, it can be assumed that the system has reached an equilibrium state.

Plot of temperature over simulation time

Figure 1: Temperature vs Time plot
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import linregress
import matplotlib as mpl

mpl.rcParams["font.size"] = 15
mpl.rcParams["axes.labelsize"] = 20


physical_properties = ["SIMULATION-TIME", "TEMPERATURE", "PRESSURE", "E(TOT)", "E(QM)", "N(QM-ATOMS)", "E(KIN)", "E(INTRA)","VOLUME","DENSITY","MOMENTUM","LOOPTIME"]
physical_units = ["fs", "K","bar", "kcal/mol", "kcal/mol", "-","kcal/mol","kcal/mol","A^3", "g/cm^3", "amuA/fs", "s"]
data = pd.read_csv("../data/physical_properties.en",sep='\t+',names=physical_properties)
# print(data.head())
# print("Shape of data: ", data.shape)
time = data["SIMULATION-TIME"]/1e6 # ns
temperature_avg = np.mean(data["TEMPERATURE"])
temperature_std = np.std(data["TEMPERATURE"])
window = 50
gap = 2
running_average = data["TEMPERATURE"].rolling(window=window).mean() #window=window,min_periods=1,center=True,step=gap

def linear_function(x,a,b):
  return a*x+b



plt.figure(figsize=(8,4))
plt.plot(time,data["TEMPERATURE"],color="black", alpha=0.5,label=r"simulation")
plt.plot(time,running_average,label=r"$T_{\mathrm{run.avg}}$")
plt.hlines(temperature_avg,xmin=np.min(time),xmax=np.max(time),linestyles="solid",color="black",label=r"$T_{\mathrm{avg}}$")
plt.hlines(temperature_avg+temperature_std,xmin=np.min(time),xmax=np.max(time),linestyles="dashed",color="black",label=r"$T_{\mathrm{std}}$")
plt.hlines(temperature_avg-temperature_std,xmin=np.min(time),xmax=np.max(time),linestyles="dashed",color="black")

plt.xlabel(r"$t$ / ns")
plt.ylabel(r"$T$ / K")
plt.legend(fontsize=12)
plt.show()

Linear / Volumetric Thermal Expansion Coefficient

The thermal expansion coefficient is a quantity that describes the extent to which a solid substance undergoes a change in response to variations in temperature.


Therefore the system is computed in \(NPT\) ensemble at different temperatures \(T_i\) at \(1\,\mathrm{bar}\).


For our case we consider a temperature range from \((248 - 348)\,\mathrm{K}\) in \(25\,\mathrm{K}\) steps.

Linear thermal expansion coefficient

Linear thermal expansion coefficient at \(\mathrm{298}\,\mathrm{K}\) is calculated by the temperature gradient of the respective average lattice parameter \(\left < a \right >\), \(\left < b \right >\) or \(\left < c\right >\) at constant pressure \(P\) (\(NPT\) ensemble)

\[ \alpha _{x}^{T_{\mathrm{298\,K}}} = \frac{1}{x} \left ( \frac{\partial x}{\partial T} \right ) _{P}, x={a,b,c} \]


The slope can be estimated numerically in different ways. A simple way is to employ a finite difference method. There are three fundamental types of it the forward, backward and central finite difference.

Linear thermal expansion coefficient

In our case we use the central finite difference method with five-point stencil. In general it is formulated in 1D as:

\[ \begin{align} f'(x) = \frac{-f(x+2h)+8f(x+h)-8f(x-h)+f(x-2h)}{12 h} + \\ +\frac{h^4}{30}f^{(5)(c)}, c \in [x-2h,x+2h] \end{align} \] where \(h\) is the change of \(x\).


If you use more than five points higher stencil order can be used.

Linear thermal expansion coefficient

In our case the differentiation can be written as:


\[ \frac{\partial x}{\partial T} \approx \frac{ \left < x^{T_{\mathrm{248\,K}}}\right >-8 \left < x^{T_{\mathrm{273\,K}}}\right > + 8 \left < x^{T_{\mathrm{323\,K}}}\right > - \left < x^{T_{\mathrm{348\,K}}} \right > }{12\Delta T} \]


\[ x \in {a,b,c} \]

Volumetric thermal expansion coefficient

It is the same formulation as for the linear thermal expansion coefficient, except that the volume \(V\) is used instead of the individual lattice parameters:

\[ \alpha _{V}^{T_{\mathrm{298\,K}}} = \frac{1}{V} \left ( \frac{\partial V}{\partial T} \right ) _{P} \]

The volume of a orthorhomibc system can be calculated as: \[ V = a\cdot b \cdot c \]

In general of a non-orthorhombic crystal sytems the volume is calculatd by: \[ V = a b c \sqrt{1-\mathrm{cos}^2(\alpha) -\mathrm{cos}^2(\beta) - \mathrm{cos}^2(\gamma)+ 2(\mathrm{cos}(\alpha) \mathrm{cos}(\beta) \mathrm{cos}(\gamma))} \]

Linear / Volumetric Thermal Expansion Coefficient

Question

How you can estimate linear/ volumetric thermal expansion coefficient in the lab?

Answer

X-ray diffraction (XRD) powder X-ray diffraction (PXRD) neutron powder diffraction (NPD)

Plot Lattice Parameter vs. Temperature

Figure 2: T vs a plot

Interaction Energy

The interaction energy is a measurement to estimate the energy share of interaction between the guest and host system. It can be calculated by substracting the single energy shares of the individual systems (\(U_{\mathrm{guest}}\) and \(U_{\mathrm{host}}\)) from the energy of the total system (\(U_{\mathrm{guest@host}}\)): \[ U_{\mathrm{int}} = \left< U_{\mathrm{guest@host}} \right > - \left< U_{\mathrm{host}} \right > - \left< U_{\mathrm{guest}} \right >. \]

In doing so you take the average total energy \(U=E_{\mathrm{kin}}+U_{\mathrm{pot}}\) of the respective simulated systems (guest@host, host, guest) at a long enough equilibrated trajectories.

Radial Distribution Functions (RDFs)

The radial distribution function (RDF) is a simple measurement to get structural information of the system.

Radial Distribution Functions (RDFs)

It describes the probability \(P_{ab}(r)\) to find a target particle type \(b\) at a distance \(r\) away from a reference particle type \(a\) e.g. the probability to find \(\mathrm{H}_2\)from \(\mathrm{Zn}^{2+}\) clusters in a distance \(r\): \[ P_{ab}(r) = \int_{0}^{r'} 4\pi r'^2 g_{ab}(r') dr' \] where \(g_{ab}(r)\) is the RDF. The RDF formulats the average local density \(\left <\rho_{ab}(r)\right >\) normalized by \(\rho=(N_a N_b)\) \[ g(r) = \frac{\left <\rho_{ab}(r)\right >}{\rho}. \]

Radial Distribution Functions (RDFs)

The two-particle density correlation function is defined as:

\[ \rho_{ab}(r) =\sum_{i=1}^{N_a} \sum_{j=1}^{N_b} \left < \delta(|\vec{r_{i}}-\vec{r_{j}}| -r)\right >. \]

The average number of particles \(b\) that can be found in the shell of distance \(r\) can be determined by multiplying the density \(\rho\) with the probability \(G(r)\) \[ N_{ab}(r) = \rho G_{ab}(r) \]

Radial Distribution Functions (RDFs)

Dirc-Delta Function

\[ \delta(x-x_0) = \begin{cases} 0 & x \neq x_0\\ \infty & x = x_0\\ \end{cases} \] \[ \int_L \delta(x-x_0) dx = 1, x_0 \in L \]

Self-Diffusion Coefficient

In order to describe the dynamic of guests in host systems, it is necessary to calculate transport properties. A transport property would be for example the diffusion, viscosity, electrical or thermal conductivity.

In general transport properties can be described as a property coefficient \(\gamma\) which depends on microscopic variable \(A\) (e.g. positions \(\Delta \vec{r}\) or the velocities \(\vec{v}\)) that is written as an infinite time integration of a non-normalized time correlation function \(\left < \dot{A}(t)\dot{A}(t_0) \right >\) : \[ \gamma = \int_{0}^{\infty} dt \left < \dot{A}(t)\dot{A}(t_0) \right >. \]

Self-Diffusion Coefficient

This is also known as Green-Kubo relation which can also written as equivalent Einstein expression

\[ \gamma = \lim_{t\to\infty} \frac{\left < (A(t) - A(t_0))^2 \right >}{2t} = \frac{1}{2} \lim_{t\to\infty} \frac{d}{dt} \left < (A(t) - A(t_0))^2 \right > \]

Self-Diffusion Coefficient

In case of self-diffusion coefficient \(D_{\mathrm{s}}\) the variable \(A\) is the molecular position \(\vec{r}_{\mathrm{cm}}\) which in general is the center of mass position of the particles: \[ \vec{r}_{\mathrm{cm}} = \frac{\sum_{i=1}^{N} m_i \vec{r}_i}{\sum_{i=1}^{N} m_i} \] where \(m_i\) is the atomic mass, \(\vec{r}_i\) is the atomic positions and \(N\) the number of atoms in the particle.

As well as \(\dot{A}\) is the molecular velocity which should be also considered as center of mass.

Einstein-Relation

The Einstein Relation can be therefore written as slope of the mean-squared-displacement (\(MSD(\tau)\)) over time origins \(\tau\)

\[ D_{\mathrm{s}} = \frac{1}{2 \cdot d} \lim _{t\to \infty} \frac{d}{dt} MSD(\tau) \] where \(d=1,2,3\) is the dimensionality.

Mean-Squared Displacement

The mean-squared displacement describes the the temporally displacement of the particle from a time origin \(t_0\) averaged over the time interval \(\tau\) \[ MSD(\tau) = \left < \frac{1}{N} \sum_{i=1}^{N} |\vec{r} _{i}(t) - \vec{r} _{i}(t_0)|^2 \right >_{\tau} \] where \(N\) is the number of the particle, \(\vec{r}_{i}\) is the center-of-mass position vector of the particle \(i\).

Mean-Squared Displacement

Looking at the Figure Figure 3 (a), we see that the beginning of the MSD is not linear. Only at higher correleation times the normal diffusion (see Figure Figure 3 (b)) behaviour appear.

Question:

  • How do you get the self-diffusion coefficient out of the MSD regarding the Einstein-relation?
  • How do you get the gradient/slope of the MSD?
  • What does the limes in that formular mean?

Answer:

The self-diffusion coefficient is the slope of the MSD divided by 2 \(\cdot d\).

But this is only true if correlation time is long enough. We can say that in the linear regime we fulfill this limes due to unchanging slope which represents the difffusion coefficient.

Therefore it is important to have really linear behaviour in order to apply the Einstein relation.

You get the slope by fitting a linear regression at the last linear section.

Mean-Squared Displacement

Figure 3: (a) MSD(t) plot, (b) diffusion regimes

Green-Kubo Relation

The Green-Kubo formalism needs a numerical integration of the velocity-autocorrelation function \(VACF(t)\) to get the self-diffusion coefficent \(D_{\mathrm{s}}\) \[ D_{\mathrm{s}} = \frac{1}{d} \int _{0}^{\infty} VACF(t) dt \] where \(d=1,2,3\) is the dimension.

Veloctiy-Autocorrelation Function

The Velocity-Autocorrelatin Function \(VACF(\tau)\) is averaged over a time interval \(\tau\) \[ VACF(\tau) = \left < \frac{1}{N} \sum_{i=1}^{N} |\vec{v} _{i}(t) \vec{v} _{i}(t_0)|^2 \right >_{\tau} \] where \(\vec{v}_{i}\) is the velocity of the particle \(i\) and \(t_0\) labels the time origin.

Veloctiy-Autocorrelation Function

Figure 4: VACF(t) plot

Activation Energy

The diffusion coefficient is directly depending on the activation energy \(E_{\mathrm{a}}\) by an exponential decacy \[ D_{\mathrm{s}}=D_0 e^{-\frac{E_{\mathrm{a}}}{RT}} \] where \(D_0\) is factor of the exponential function, \(R=8.314\,\mathrm{JK^{-1}mol^{-1}}\) is the gas constant and \(T\) is the temperature.

With this formalism we can apply an Arrhenius equation to fit a linear function to get the activation energy \(E_{\mathrm{a}}\) \[ \ln (D_{\mathrm{s}}) = -\frac{E_{\mathrm{a}}}{RT} + \ln{D_0} \]

Activation Energy

Figure 5: (a) D(T), (b) D(n), (c) ln(D(1000/T) and (d) Ea(n))